Aide memoire - Spatial sampling


Introduction 

The land surface, the materials of which it is composed and the environment more generally are
continuous. In general, measurements or observations can be made on only small portions of them,
i.e. on samples, because of the large areas involved. For example, in a single agricultural field there
is an infinite number of potential sampling points. Samples intended to represent the areas from
which they are drawn must be planned with care. The information from a sample location should
represent a surrounding area, the extent of which we might not know. Since many environmental
properties vary locally in a complex and erratic way, the values from a single sampling point
include a sampling effect. To increase the information from a sampling location so that it is
representative a bulked sample can be taken, and provided that the property is additive the
measurement made on it will equal the regional mean apart from sampling error.

At the outset consider the use that will be made of the sample information. For instance, will the
mean values of the properties observed for the entire area or for strata within the area be used to
predict at unsampled places? Or will the information be used to predict locally, either using
other interpolators or geostatistical ones? For either of the latter the sample data must be
spatially autocorrelated for them to have any merit.

This aide lists the matters that must be considered and resolved in planning sampling of a
geographic  region,  which  for  present  purposes  we  treat  as  two-dimensional. 

Defining  the  target 

The  domain 

The domain is the region of interest. Circumscribe it by a boundary on a map so that every point
can be assigned to the domain or not with certainty. The domain may comprise a single parcel of
land or several. Denote it by D.

Support 

The support is the area or volume of material on which you make measurements. It has size and
shape, and may have orientation. In remote sensing it is the footprint of the pixel; in vegetation
surveys it is the quadrat; in soil survey it is the core of soil taken from the ground. Cores of soil
may be taken from areas larger than the cross-section of the cores and bulked for analysis in the
laboratory. In these cases the supports are the larger areas.

In  any  one  survey  define  the  support  and  keep  it  constant  throughout. 


The  population  and  units 

Within D are units that have the dimensions of the supports. In a remote image their number is
finite though large. In soil survey they are so many that they may be regarded as infinite. Define
them by their spatial coordinates and their spatial extents. Together they comprise the population.
The terms 'population' and 'units' may also be used to refer to the values of a variable on the supports.

The  target 

Within D there may be only certain kinds of terrain or land use that are of interest, e.g. only dry
land (not water), only farm land (not towns, not parks, not golf-courses, etc.). The units falling in
these classes constitute the target population. The others do not belong.

Samples 

Whole  populations  cannot  be  measured  in  ground  survey;  you  can  measure  only  subsets  of  the  units 
that  comprise  them.  Such  a  subset  of  units  is  a  sample. 

Typically you will want two characteristics in a sample: accuracy and reliability. The first means
that a sample represents the population without bias, i.e. any value estimated from it
will be as likely to exceed the true value of the population as it will be to fall short. The second
implies that repeated sampling will give sensibly the same result. It is measured by the estimation
variance or standard error of the mean, s.e.

These  characteristics  can  be  assured  by  the  sampling  design  in  which  there  is  sufficient 
randomness. 


Notation 

We  adopt  the  following  basic  notation. 

D denotes the domain.

|D| is the area of D.

z is the variable of interest.

Z is a random variable.

Z(x) is a random process (random field, stochastic process), in which Z may take any one of two or
more values at random at each point x in D.

N is the size of the sample in D, i.e. the number of units in it.

D_k denotes the kth subdivision of D, of which there may be K.

n_k is the number of units in a sample of D_k.

μ denotes the mean of z in D.

z̄ is the mean of the N data drawn from D.

σ² is the variance of z in D.

s² = σ̂² is the estimate of σ² from the N data.

s²(D) is the estimation variance of μ in D.

s(D) is the standard error of μ.

h denotes the lag separating two places, and is a vector in two dimensions; |h| is the distance
component of the lag.

γ(h) signifies the semivariance at lag h.

λ_i are the kriging weights.


Sampling  designs  for  design-based  estimation 

This  is  essentially  the  classical  statistical  approach  to  sample  design  and  prediction. 

Simple  random  sampling 

In simple random sampling N units are chosen with equal probability from the target population.
The result is unbiased, and the estimation variance s²(D) is given by s²/N.

If  there  is  any  spatial  correlation  at  the  working  scale  then  this  is  inefficient  in  the  sense  that  the 
same  estimation  variance  could  be  achieved  with  a  smaller  sample  by  a  better  design. 
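
The calculation for simple random sampling can be sketched in Python as follows. This is a minimal illustration only: the function name `srs_estimate`, the list `population` standing in for the values of z on the target units, and the fixed seed are assumptions made for the example, not part of the original aide.

```python
import random
import statistics


def srs_estimate(population, n_sample, seed=0):
    """Draw a simple random sample and estimate the mean and its precision.

    `population` is a hypothetical list of z values, one per unit of the
    target population; `n_sample` plays the role of N in the text.
    """
    rng = random.Random(seed)                  # fixed seed only for repeatability
    sample = rng.sample(population, n_sample)  # equal probability, no replacement
    mean = statistics.fmean(sample)            # estimate of the mean of z in D
    s2 = statistics.variance(sample)           # s^2, the estimate of the variance
    est_var = s2 / n_sample                    # estimation variance, s^2 / N
    return mean, est_var, est_var ** 0.5       # mean, s^2(D), s(D)
```

The third value returned is the standard error s(D), which is the quantity to compare against any tolerance set for the survey.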

Stratified  random  sampling 

Divide the region into strata, D_k, k = 1, 2, ..., K, and represent each by a few units, ideally two,
chosen at random independently. The sizes n_k may be chosen in proportion to the areas of the D_k,
|D_k|, if they are not equal.

If other sizes are chosen then the mean in D may be calculated as the weighted average of the
individual stratum means with weights proportional to the |D_k|. The estimation variance of stratified
sampling depends on the variance within the strata, or the pooled within-stratum variance. In the
presence of spatial dependence the latter is less than the total variance in the population, and
stratified sampling is more efficient than simple random sampling.


The estimation variance is given by

$$ s^2(D)_{\mathrm{stratified}} = \sum_{k=1}^{K} w_k^2 \, s^2(D_k), $$

where s²(D_k) is the estimation variance within stratum k, and w_k is the weight assigned to the
stratum. The weights should sum to 1 to avoid bias.

There are numerous ways in which this general scheme can be elaborated according to what you
know of the region and the variation within it. For example, the strata could have unequal spatial
extents as in classification. In this case the different areas are taken into account through the weight
w_k, such that

$$ w_k = \frac{\text{area of stratum } k}{\text{total area}} = \frac{|D_k|}{|D|}. $$
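
A stratified estimate with area weights might be computed as in the sketch below. It assumes the usual form in which the weights enter the estimation variance squared and in which the variance within each stratum is estimated as the sample variance divided by the number of units in that stratum. The function name and the argument layout are hypothetical.

```python
import statistics


def stratified_estimate(samples_by_stratum, areas):
    """Area-weighted mean and estimation variance from stratified data.

    `samples_by_stratum[k]` holds the z values sampled in stratum k (at
    least two per stratum); `areas[k]` is the area of that stratum.
    """
    total_area = sum(areas)
    weights = [a / total_area for a in areas]   # weights sum to 1
    mean = 0.0
    est_var = 0.0
    for w, data in zip(weights, samples_by_stratum):
        mean += w * statistics.fmean(data)
        # estimation variance within the stratum: s_k^2 / n_k
        within = statistics.variance(data) / len(data)
        est_var += w ** 2 * within
    return mean, est_var
```

For two strata with areas in the ratio 1:3 and samples [1, 3] and [10, 14], this gives a mean of 9.5 and an estimation variance of 2.3125.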


Systematic  sampling 

Sampling  is  usually  most  efficient  when  done  on  a  regular  grid.  It  has  two  disadvantages: 

(1)  it  provides  no  ready  estimate  of  the  variance; 

(2)  it  may  lead  to  biased  estimates  of  the  mean. 

The first arises because once the origin and orientation of the grid are decided there is no further
randomization possible. It is not easily overcome, but the estimation variance may be approximated
by methods such as Yates's balanced differences.

The  second,  bias,  can  happen  where  there  is  trend  or  periodicity  in  z  in  the  region.  Periodicity  is 
usually  evident,  and  if  it  is  then  you  can  choose  an  interval  and  orientation  that  will  be  out  of  tune 
with  it.  Alternatively,  choose  a  non-aligned  scheme  in  which  each  sampling  point  on  the  grid  is 
offset  from  its  node  by  a  random  distance  along  its  row  and  down  its  column  according  to  a  rule. 
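
One way to generate a non-aligned scheme is sketched below. The text says only that the offsets follow "a rule"; the rule used here (one random offset along the row shared by all points in that row, and one random offset down the column shared by all points in that column) is a common unaligned design and is an assumption, as are the function and parameter names.

```python
import random


def unaligned_grid(n_cols, n_rows, interval, seed=0):
    """Return (x, y) sampling points, one per grid cell, offset from the nodes."""
    rng = random.Random(seed)
    # Offsets are fractions of the grid interval in [0, 1).
    x_offset_by_row = [rng.random() for _ in range(n_rows)]
    y_offset_by_col = [rng.random() for _ in range(n_cols)]
    points = []
    for row in range(n_rows):
        for col in range(n_cols):
            x = (col + x_offset_by_row[row]) * interval  # shift along the row
            y = (row + y_offset_by_col[col]) * interval  # shift down the column
            points.append((x, y))
    return points
```

Each point stays inside its own grid cell, so the even coverage of the grid is kept while the strict alignment that interacts with periodicity is broken.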


Sample  size 

The size of sample N may depend on the budget or the tolerance, i.e. the error that can be tolerated in
the  estimate  from  the  survey.  If  the  budget  is  fixed  then  choose  a  stratified  scheme  to  minimize 
the  error  for  that  budget. 

If  the  error  is  specified  as  s(D)  then  for  simple  random  sampling 


$$ N = s^2 / s^2(D). $$

The  formula  for  stratified  sampling  is  more  elaborate. 


You usually do not know s² in advance, and so choosing N is problematic. Therefore sample in
stages  starting  with  a  sparse  design  that  can  be  intensified  as  necessary.  At  each  stage  calculate  the 
estimation  variance  to  see  whether  it  meets  the  tolerance.  If  it  does  then  stop;  otherwise  intensify 
the  sampling  and  recompute  the  estimation  variance  as  the  next  stage. 
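
The sample-size rule and the staged check can be sketched as follows. The function names are hypothetical, and the sketch covers simple random sampling only.

```python
import math
import statistics


def required_sample_size(s2, tolerable_se):
    """N = s^2 / s^2(D), rounded up to a whole number of units."""
    return math.ceil(s2 / tolerable_se ** 2)


def meets_tolerance(data, tolerable_se):
    """Staged check: is the estimation variance of the data so far small enough?"""
    est_var = statistics.variance(data) / len(data)
    return est_var <= tolerable_se ** 2
```

For example, with a variance s² of 30 and a tolerable standard error of 2, `required_sample_size(30.0, 2.0)` gives 8 units. In staged sampling, call `meets_tolerance` on the accumulated data after each stage and stop once it returns `True`.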


Geostatistical  (model-based)  sampling  design  and  prediction 

Geostatistics  is  used  to  estimate  local  values  rather  than  regional  ones,  i.e.  to  predict.  It  is  based  on 
the  assumption  that  z  in  the  real  world  is  a  realization  of  the  random  process  Z(x).  For  this  reason 
there  is  no  need  to  randomize  the  sampling,  and  grid  sampling  is  preferred  because  of  its  efficiency. 

Geostatistical prediction (kriging) requires a model of the correlation structure, expressed either as
a covariance function, or rather more generally as a variogram. Like the variance in design-based
estimation, these functions are not known a priori and must be estimated from sample data.
Sampling  must  therefore  serve  two  purposes: 

(1)  estimation  and  modelling  of  the  variogram,  and 

(2)  local  prediction  once  the  variogram  has  been  estimated  and  modelled. 

To  satisfy  item  (1)  sampling  must  be  sufficient  to  estimate  the  semivariances  precisely.  It  must  also 
be  dense  enough  to  estimate  the  spatial  characteristics  of  the  variation,  such  as  correlation  range 
and  general  form  of  the  variogram. 

Sampling  for  item  (2)  will  depend  either  on  the  budget  or  on  the  tolerable  error  of  local  predictions 
and  the  variogram. 

Sampling  to  estimate  the  variogram 

Nested  sampling  and  analysis 

Start  with  nested  sampling  and  a  hierarchical  analysis  of  variance  of  the  sample  data  if  you  know 
nothing  of  variation  in  the  region.  Choose  five  or  six  sampling  intervals  in  geometric  progression 
from  the  smallest  lag  distance  of  interest  to  the  largest.  Choose  the  angular  separations  at  random. 
Replicate  at  the  longer  distances  to  give  sufficient  degrees  of  freedom  in  the  analysis  of  variance  to 
estimate  the  components.  Expect  to  have  a  total  sample,  N,  of  about  100.  Figure  1  shows  the  kind 
of  sampling  plan  to  aim  for. 

Accumulate the components of variance to estimate γ(|h|) at the distances of the design and draw a
crude variogram with the logarithm of |h| on the abscissa as in Figure 2.
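
The accumulation step can be sketched as below, assuming the components of variance have already been estimated by the hierarchical analysis of variance, one per stage. The function name is hypothetical and the numbers in the usage note are invented for illustration.

```python
import math
from itertools import accumulate


def accumulate_components(distances, components):
    """Sum the components of variance from the shortest distance upward.

    Returns (log10 distance, accumulated variance) pairs, ready to plot
    as a first approximation to the variogram.
    """
    pairs = sorted(zip(distances, components))      # shortest distance first
    dists = [d for d, _ in pairs]
    gammas = list(accumulate(c for _, c in pairs))  # running total of components
    return [(math.log10(d), g) for d, g in zip(dists, gammas)]
```

With stages at 6, 19, 60, 190 and 600 m and components 1.0, 0.5, 2.0, 0.25 and 0.25, the accumulated values are 1.0, 1.5, 3.5, 3.75 and 4.0. In this made-up case most of the variance is added between 19 m and 60 m.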


Figure 1. The plan of sampling for one main centre in a nested survey with 7 stages. The stages of
the hierarchy are given for each sampling site.


Figure  2.  The  accumulated  components  of  variance  from  a  hierarchical  analysis  of  variance 
giving  a  first  approximation  to  the  variogram. 


Such a result can be used to identify the range of distance within which most variance occurs
and  to  plan  further  sampling  to  estimate  the  conventional  variogram. 


If all the variance appears to occur within the smallest distance of interest then local prediction is
not feasible. So stop! Figure 3 shows an example of a pure nugget reconnaissance variogram: all
of the variation is occurring within the shortest sampling interval.


Figure  3.  A  pure  nugget  reconnaissance  variogram  from  a  nested  survey. 


Estimating  the  variogram  parameters 

Use the result from the hierarchical analysis above or other knowledge of the variation in D to
estimate semivariances, γ(h), at several lags, h, within the correlation range. Design a scheme with
approximately 100 to 150 sampling points if the variation appears isotropic. If a square grid with
this number gives you sufficient estimates of γ(h) within the correlation range then use it. If not
then cluster the sampling in some way. Intensify sampling around a subset of grid nodes, bearing in
mind that you are likely to want a grid for kriging later. Alternatively, sample in clusters with a
range of sampling distances between locations, and spread the clusters evenly over D.


Do not cluster sampling in parts of D that you know or suspect to have unusually large values of z
(as you might in mineral surveys or pollution studies) or unusually small ones (as in studies of
deficiency diseases). This will result in bias.


Compute the sample variogram and plot the result. If the estimated values fall close to a smooth
curve then choose an authorized model to describe it, estimate its parameters, and proceed to
kriging.

If there is too much scatter to identify a plausible function then increase the sampling, either by
intensifying the grid or by adding clusters, and recompute the variogram. Repeat until a smooth
form is identifiable.
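
Computing the sample variogram can be sketched with the usual method-of-moments estimator, in which the semivariance at a lag is half the mean squared difference between all pairs of data separated by that lag. The aide does not name an estimator, so this choice is an assumption, as are the function name and the isotropic binning by distance.

```python
import math
from collections import defaultdict


def sample_variogram(coords, values, bin_width):
    """Isotropic sample variogram as (mean distance, semivariance, pairs) per bin."""
    sq_diff = defaultdict(float)
    dist_sum = defaultdict(float)
    count = defaultdict(int)
    n = len(coords)
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            b = round(d / bin_width)        # nearest multiple of the bin width
            sq_diff[b] += (values[i] - values[j]) ** 2
            dist_sum[b] += d
            count[b] += 1
    return [
        (dist_sum[b] / count[b], sq_diff[b] / (2 * count[b]), count[b])
        for b in sorted(count)
    ]
```

Plot the semivariances against the mean distances, and treat bins with few pairs cautiously because their estimates are the least reliable.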


If the variation is anisotropic and you wish to model the anisotropy then you must expect to sample
at  200  points  or  more. 


Kriging 

In kriging Z at an unknown point x_0 we minimize the prediction variance

$$ \sigma^2(x_0) = 2 \sum_{i=1}^{n} \lambda_i \, \gamma(x_0 - x_i) - \sum_{i=1}^{n} \sum_{j=1}^{n} \lambda_i \lambda_j \, \gamma(x_i - x_j), \qquad (1) $$

where n ≪ N is the number of sampling points near to the target point x_0. The quantities
γ(x_i − x_j) and γ(x_0 − x_i) depend on the separations x_i − x_j and x_0 − x_i; the larger these are the
larger is σ²(x_0).
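
The sketch below evaluates the prediction variance for punctual ordinary kriging: the weights are constrained to sum to 1 through a Lagrange multiplier, and the variance is then computed from the two sums over the weights and semivariances. The spherical variogram, its parameter values in the usage note and all function names are assumptions chosen for illustration; the small linear solver is included only to keep the example free of external libraries.

```python
import math


def spherical(h, nugget, sill_c, a):
    """Spherical variogram with nugget, spatially correlated sill `sill_c`, range `a`."""
    if h == 0:
        return 0.0
    if h >= a:
        return nugget + sill_c
    r = h / a
    return nugget + sill_c * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ordinary_kriging(points, target, gamma):
    """Return (weights, prediction variance, Lagrange multiplier) at `target`."""
    n = len(points)
    a = [[gamma(math.dist(p, q)) for q in points] + [1.0] for p in points]
    a.append([1.0] * n + [0.0])                      # weights must sum to 1
    g0 = [gamma(math.dist(p, target)) for p in points]
    sol = solve(a, g0 + [1.0])
    lam, psi = sol[:n], sol[n]
    first = sum(l * g for l, g in zip(lam, g0))
    second = sum(lam[i] * lam[j] * a[i][j] for i in range(n) for j in range(n))
    return lam, 2.0 * first - second, psi
```

For four data at the corners of a unit square and a target at the centre, for example `ordinary_kriging([(0, 0), (1, 0), (0, 1), (1, 1)], (0.5, 0.5), lambda h: spherical(h, 0.0, 1.0, 2.0))`, symmetry gives each point a weight of 0.25.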


The maximum value of σ²(x_0) is minimized by sampling on a regular grid. A triangular grid is
usually the most efficient, but rectangular grids are almost as good (Figure 3a), and as they are
easier to lay out and document they are preferred. If variation is isotropic then use a square grid.


If the budget is fixed then sample as intensely as it permits. If a maximum tolerance is specified,
say s²_Kmax, then solve the kriging system for a range of sampling intensities (grid intervals) and plot
the kriging variance (or its square root, the kriging error) on the ordinate against the grid interval on
the abscissa. Connect the points by a smooth line (Figure 3). From s²_Kmax, or s_Kmax, draw a horizontal
line until it meets the curve, and from that intersection drop a perpendicular. The value at which the
perpendicular cuts the abscissa is the required sampling interval (Figure 3b).
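
The graphical procedure can be mimicked numerically, as in the sketch below. It assumes that the kriging variances have already been computed for a series of grid intervals and that they increase with the interval, and it joins the points by straight lines rather than a smooth curve. The function name is hypothetical and the values in the usage note are invented.

```python
def interval_for_tolerance(intervals, variances, max_variance):
    """Largest grid interval whose kriging variance stays within the tolerance.

    Returns None if even the smallest interval tried exceeds the tolerance.
    """
    pairs = sorted(zip(intervals, variances))    # smallest interval first
    if pairs[0][1] > max_variance:
        return None
    best = pairs[0][0]
    for (d0, v0), (d1, v1) in zip(pairs, pairs[1:]):
        if v1 <= max_variance:
            best = d1                            # still within tolerance
        else:
            # The tolerance line crosses the curve between d0 and d1.
            return d0 + (max_variance - v0) * (d1 - d0) / (v1 - v0)
    return best
```

With intervals of 10, 20, 30 and 40 m, kriging variances of 1, 2, 4 and 8, and a tolerance of 3, the function returns 25 m.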

Determine the number of cores in bulked samples similarly. Compute the estimation variances
using Equation (1) for equispaced sampling configurations and sample sizes from 4 to about 50 and
join the values to form a curve (Figure 4). Draw a horizontal line at the maximum tolerable
variance, and drop a perpendicular from the point where it intercepts the curve to the abscissa.
The value on the abscissa is the optimum size of sample.

Figure 3. Kriging variances from (a) punctual kriging, and (b) block kriging.

Figure 4. Graphs of standard error plotted against sample size for bulking from 4, 9, 16, 25, 36 and
49 cores, and for three different sample supports.


